12 January 2015

Purpose of the workshop

  • To introduce the concept of IRT
  • To present the 1-PL/Rasch, 2-PL, and 3-PL models

What is item response theory?

  • A measurement perspective
  • What is measurement?
  • Assignment of numerals to objects or events leading to different scales and kinds of measurements
  • The process by which we attempt to understand a variable, which could be directly unobservable

Latent variables

Measuring mathematical knowledge

How Do We Measure Math?

What Do We Want in Our Instruments?

Scores that are invariant to our instrument

IRT

  • Models

  • Link manifest variables with latent variables

  • Latent characteristics of individuals and items are predictors of observed responses

  • Not a "how" or "why" theory

Properties of IRT

  • Manifest variables differentiate among persons at different locations on the latent scale

  • Items are characterized by location and ability to discriminate among persons

  • Items and persons are on the same scale

  • Parameters estimated in a sample are linearly transformable to estimates of those parameters from another sample

  • Yields scores that are independent of the number of items, the item difficulty, and the sample of individuals measured, and that are placed on a real-number scale

Assumptions of IRT

The response of a person to an item can be modeled with a specific item response function

Other traits

  • Test information function for designing a test
  • Methods for examining item and person misfit
  • Adaptive testing can be implemented (e.g. CAT)
  • IRT is a family of models for various response types and can also be used with multidimensional data.

IRT conceptually

IRF conceptually

Rasch model

The logistic model

\(p(x = 1 | z) = \frac{e^z}{1 + e^z}\)

The logistic regression model

\(p(x = 1 | g) = \frac{e^{\beta_0 + \beta_1g}}{1 + e^{\beta_0 + \beta_1g}}\)

The Rasch model

\(p(x_j = 1 | \theta, b_j) = \frac{e^{\theta - b_j}}{1 + e^{\theta - b_j}}\)

So, the Rasch model is just the logistic regression model in disguise

What does \(\theta - b_j\) mean?

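# Rasch item response function: probability of a correct response
# given the person's ability and the item's difficulty
rasch <- function(person, item) {
  exp(person - item) / (1 + exp(person - item))
}
# Person located below the item: less than a 50% chance of success
rasch(person = 1, item = 1.5)
## [1] 0.3775407
# Person located exactly at the item: a 50% chance of success
rasch(person = 1, item = 1)
## [1] 0.5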
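
The "logistic regression in disguise" remark can be checked numerically. The following is a minimal sketch, assuming base R's plogis() (the logistic distribution function) and some made-up ability values; it simply confirms that the rasch() helper from the slides and plogis(theta - b) give the same probabilities.

```r
# Rasch item response function, as defined in the slides
rasch <- function(person, item) {
  exp(person - item) / (1 + exp(person - item))
}

# Illustrative values (assumed, not from the workshop data)
theta <- seq(-3, 3, by = 0.5)  # person abilities
b <- 1.5                       # item difficulty

# plogis(z) computes e^z / (1 + e^z), so plogis(theta - b) is the Rasch IRF
p_rasch  <- rasch(person = theta, item = b)
p_plogis <- plogis(theta - b)

all.equal(p_rasch, p_plogis)
```

In other words, the Rasch model is a logistic model whose linear predictor is \(\theta - b_j\), with the slope fixed at 1.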

Exploring the Rasch

The 1-PL model

\(p(x_j = 1 | \theta, a, b_j) = \frac{e^{a(\theta - b_j)}}{1 + e^{a(\theta - b_j)}}\)

  • \(a\) is the item discrimination
  • What is \(a\) in the Rasch model?
  • Where is the subscript for \(a\)?
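
One way to explore these questions is to write the 1-PL as a small R function. This is only a sketch: the name one_pl() and the parameter values are illustrative assumptions, not part of the workshop materials. The discrimination a has no item subscript because a single value is shared by every item, and setting a = 1 reproduces the Rasch probabilities.

```r
# 1-PL item response function: one discrimination `a` shared by all items
# (one_pl is a hypothetical helper name used only for this sketch)
one_pl <- function(theta, b, a = 1) {
  plogis(a * (theta - b))
}

theta <- c(-1, 0, 1, 2)  # illustrative abilities (assumed)
b <- 0.5                 # illustrative item difficulty (assumed)

one_pl(theta, b)             # a = 1: identical to the Rasch probabilities
one_pl(theta, b, a = 1.7)    # larger a: steeper curve around theta = b
```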

The 1-PL model in action

Calculating ability estimates

  1. Calculate the probability of each observed response for a respondent at a given \(\theta\).
  2. Determine the probability of the response pattern (just the product of those probabilities, because of local independence).
  3. Repeat steps 1 and 2 for values of \(\theta\) between -4 and 4.
  4. Select the \(\theta\) with the highest likelihood of producing the pattern; in practice we typically work with the log-likelihood.

Doing it in R

  • Assume that \(b_1\) = 2, \(b_2\) = 1.2, and \(b_3\) = 2.5
  • What is the most likely ability that would give rise to the response pattern 011?
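# Grid of candidate abilities and the three item difficulties
person <- seq(from = -4, to = 4, by = .1)
item1 <- 2; item2 <- 1.2; item3 <- 2.5
LogLiks <- numeric(length(person))
for (i in seq_along(person)) {
  p1 <- 1 - rasch(person = person[i], item = item1)  # item 1 answered incorrectly
  p2 <- rasch(person = person[i], item = item2)      # item 2 answered correctly
  p3 <- rasch(person = person[i], item = item3)      # item 3 answered correctly
  LogLiks[i] <- log(p1 * p2 * p3)                    # log-likelihood of pattern 011
}
plot(LogLiks ~ person, type = "l", xlab = "Ability",
     ylab = "Log-Likelihood")
# Mark the ability with the highest log-likelihood
abline(v = person[which.max(LogLiks)], lty = 2)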
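
The grid search can be cross-checked with a continuous optimizer. The sketch below assumes base R's optimize() and a hypothetical helper named loglik(); it is an illustration of the same idea, not the estimation routine used by any IRT package. It maximizes the log-likelihood of the pattern 011 over the same range of abilities.

```r
# Rasch item response function (plogis(z) = e^z / (1 + e^z))
rasch <- function(person, item) plogis(person - item)

b <- c(2, 1.2, 2.5)  # item difficulties from the slide
x <- c(0, 1, 1)      # response pattern 011

# Log-likelihood of the pattern at a single ability value
# (loglik is a hypothetical helper name for this sketch)
loglik <- function(theta) {
  p <- rasch(person = theta, item = b)
  sum(x * log(p) + (1 - x) * log(1 - p))
}

# Continuous search over the same range as the grid
fit <- optimize(loglik, interval = c(-4, 4), maximum = TRUE)
theta_hat <- fit$maximum
theta_hat
```

The result should land close to the grid value marked by the dashed line, since the grid is only accurate to the nearest 0.1.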

What it looks like

This is just a point estimate, though …

  • The standard error of estimate (SEE) represents our degree of uncertainty about the location of a person.
  • The larger the SEE, the more uncertain we are.
  • This is the same thing as a standard error in statistics, so you can use it to build 95% confidence intervals around person and item parameter estimates.

Information, lots of it!

  • Information is the inverse of the squared SEE: \(I(\theta) = 1 / SEE(\theta)^2\)
  • The smaller the SEE, the more precise we are about where a person is located
  • For the 1-PL, the information function is unimodal and symmetric, and maximum information occurs at \(b\)
  • And we can sum the item information functions to get the test information function!
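
These properties can be sketched in a few lines of R. The code assumes the Rasch case (a = 1), where item information is \(p(1 - p)\), and reuses the three item difficulties from the earlier example; the helper name item_info() is an illustrative assumption.

```r
# Rasch item information: I_j(theta) = p * (1 - p)
# (item_info is a hypothetical helper name for this sketch)
item_info <- function(theta, b) {
  p <- plogis(theta - b)
  p * (1 - p)
}

theta <- seq(-4, 4, by = 0.1)
b <- c(2, 1.2, 2.5)  # item difficulties reused from the earlier example

# One column of information values per item
info <- sapply(b, function(bj) item_info(theta, bj))

test_info <- rowSums(info)   # test information: sum over items
see <- 1 / sqrt(test_info)   # standard error of estimate at each theta

# Each item's information peaks at theta = b_j, where it equals 0.25
theta[apply(info, 2, which.max)]
```

Plotting test_info or see against theta shows where on the scale this three-item test measures most precisely.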

Plotting information

The 2-PL and the 3-PL

The 2-PL model

\(p(x_j = 1 | \theta, a_j, b_j) = \frac{e^{a_j(\theta - b_j)}}{1 + e^{a_j(\theta - b_j)}}\)

The 3-PL model

\(p(x_j = 1 | \theta, a_j, b_j, c_j) = c_j + (1 - c_j)\frac{e^{a_j(\theta - b_j)}}{1 + e^{a_j(\theta - b_j)}}\)

  • \(c_j\) is the guessing parameter, or lower asymptote; \(a_j\) we have already seen
  • Our item location \(b_j\) is now the point where the probability is halfway between \(c_j\) and the upper asymptote
  • \(a_j\) gives us more information near \(b_j\) when it is greater than 1
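
A short sketch of the 3-PL item response function makes the role of each parameter concrete. The function name three_pl() and the parameter values are illustrative assumptions, not estimates from the workshop data.

```r
# 3-PL item response function; c_j = 0 gives the 2-PL,
# and c_j = 0 with a_j = 1 gives the Rasch model
# (three_pl is a hypothetical helper name for this sketch)
three_pl <- function(theta, a_j, b_j, c_j) {
  c_j + (1 - c_j) * plogis(a_j * (theta - b_j))
}

# Illustrative item parameters (assumed, not estimated from data)
a_j <- 1.5; b_j <- 0.5; c_j <- 0.2

# At theta = b_j the probability is halfway between c_j and 1: (1 + c_j) / 2
three_pl(theta = b_j, a_j = a_j, b_j = b_j, c_j = c_j)

# For very low abilities the probability approaches the lower asymptote c_j
three_pl(theta = -10, a_j = a_j, b_j = b_j, c_j = c_j)
```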

What does the 3-PL look like?

Which one to choose?

  • Sample size considerations: the Rasch model estimates fewer parameters
  • Does guessing make sense?
  • Can test empirically in R

One problem with these models

Recall

\(p(x_j = 1 | \theta, b_j) = \frac{e^{\theta - b_j}}{1 + e^{\theta - b_j}}\)

  • But we don't know \(\theta\) or \(b_j\)

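So there are an infinite number of solutions! Adding the same constant to every \(\theta\) and every \(b_j\) leaves \(\theta - b_j\), and therefore every probability, unchanged.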
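
This indeterminacy is easy to demonstrate numerically. The sketch below uses made-up abilities and an arbitrary shift of 3 (both assumptions) and shows that moving every person and every item by the same constant produces identical response probabilities.

```r
# Rasch item response function (plogis(z) = e^z / (1 + e^z))
rasch <- function(person, item) plogis(person - item)

theta <- c(-1, 0, 1.5)  # illustrative abilities (assumed)
b <- c(2, 1.2, 2.5)     # item difficulties from the earlier example
shift <- 3              # arbitrary constant; any value works

# Probability matrices: rows are persons, columns are items
p_original <- outer(theta, b, rasch)
p_shifted  <- outer(theta + shift, b + shift, rasch)

all.equal(p_original, p_shifted)
```

Because the data cannot tell these solutions apart, estimation has to fix the origin of the scale, for example by constraining the mean ability or the mean item difficulty to zero.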

IRT models in R with irtoys

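library(irtoys)  # est(), eap(), and normal.qu() come from irtoys
# Fit a 2-PL model to the scored responses, using ltm as the estimation engine
p.2pl <- est(Scored, model="2PL", engine="ltm")
# Item parameter estimates (columns 1-3) beside their standard errors (columns 4-6)
cbind(p.2pl$est[1:2,],p.2pl$se[1:2,])
##             [,1]       [,2] [,3]      [,4]       [,5] [,6]
## Item 1 0.6326689 -2.0005810    0 0.1359665 0.41104688    0
## Item 2 1.5469622 -0.2681548    0 0.1928367 0.08792123    0
# EAP ability estimates based on the fitted item parameters
th.eap <- eap(resp=Scored, ip=p.2pl$est, qu=normal.qu())
th.eap[1:2,]
##             est       sem  n
## [1,]  0.8083981 0.4996128 18
## [2,] -1.3387368 0.4051805 18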

Other models

  • If your response data are polytomous, there are polytomous IRT models.
  • These are essentially generalizations of the Rasch and 2-PL models to polytomous responses.
  • Multidimensional data? No problem: fit a MIRT model, bifactor model, etc., with the mirt package.